{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Deep<span style=\"color:green\">|</span>Bayes summer school. Practical session on EM algorithm"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One of the school organisers decided to prank us and hid all games for our Thursday Game Night somewhere.\n",
    "\n",
    "Let's find the prankster!\n",
    "\n",
    "When you recognize [him or her](http://deepbayes.ru/#speakers), send:\n",
    "* name\n",
    "* reconstructed photo\n",
    "* this notebook with your code (doesn't matter how awful it is :)\n",
    "\n",
    "__privately__ to [Nadia Chirkova](https://www.facebook.com/nadiinchi) at Facebook or to info@deepbayes.ru. The first three participants will receive a present. Do not make spoilers to other participants!\n",
    "\n",
    "Please, note that you have only __one attempt__ to send a message!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from matplotlib import pyplot as plt\n",
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "DATA_FILE = \"data_em\"\n",
    "w = 73 # face_width"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We are given a set of $K$ images with shape $H \\times W$.\n",
    "\n",
    "It is represented by a numpy-array with shape $H \\times W \\times K$:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "X = np.load(DATA_FILE)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "X.shape # H, W, K"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Example of noisy image:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.imshow(X[:, :, 0], cmap=\"Greys_r\")\n",
    "plt.axis(\"off\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Goal and plan"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Our goal is to find face $F$ ($H \\times w$).\n",
    "\n",
    "Also, we will find:\n",
    "* $B$: background  ($H \\times W$)\n",
    "* $s$: noise standard deviation (float)\n",
    "* $a$: discrete prior over face positions ($W-w+1$)\n",
    "* $q(d)$: discrete posterior over face positions for each image  (($W-w+1$) x $K$)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Implementation plan:\n",
    "1. calculating $log\\, p(X  \\mid d,\\,F,\\,B,\\,s)$\n",
    "1. calculating objective\n",
    "1. E-step: finding $q(d)$\n",
    "1. M-step: estimating $F,\\, B, \\,s, \\,a$\n",
    "1. composing EM-algorithm from E- and M-step\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Implementation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "### Variables to test implementation\n",
    "tH, tW, tw, tK = 2, 3, 1, 2\n",
    "tX = np.arange(tH*tW*tK).reshape(tH, tW, tK)\n",
    "tF = np.arange(tH*tw).reshape(tH, tw)\n",
    "tB = np.arange(tH*tW).reshape(tH, tW)\n",
    "ts = 0.1\n",
    "ta = np.arange(1, (tW-tw+1)+1)\n",
    "ta = ta / ta.sum()\n",
    "tq = np.arange(1, (tW-tw+1)*tK+1).reshape(tW-tw+1, tK)\n",
    "tq = tq / tq.sum(axis=0)[np.newaxis, :]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 1. Implement calculate_log_probability\n",
    "For $k$-th image $X_k$ and some face position $d_k$:\n",
    "$$p(X_k  \\mid d_k,\\,F,\\,B,\\,s) = \\prod_{ij}\n",
    "    \\begin{cases} \n",
    "    \t\\mathcal{N}(X_k[i,j]\\mid F[i,\\,j-d_k],\\,s^2), \n",
    "    \t& \\text{if}\\, (i,j)\\in faceArea(d_k)\\\\\n",
    "    \t\\mathcal{N}(X_k[i,j]\\mid B[i,j],\\,s^2), & \\text{else}\n",
    "    \\end{cases}$$\n",
    "\n",
    "Important notes:\n",
    "* Do not forget about logarithm!\n",
    "* This implementation should use no more than 1 cycle!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def calculate_log_probability(X, F, B, s):\n",
    "    \"\"\"\n",
    "    Calculates log p(X_k|d_k, F, B, s) for all images X_k in X and\n",
    "    all possible face position d_k.\n",
    "\n",
    "    Parameters\n",
    "    ----------\n",
    "    X : array, shape (H, W, K)\n",
    "        K images of size H x W.\n",
    "    F : array, shape (H, w)\n",
    "        Estimate of prankster's face.\n",
    "    B : array, shape (H, W)\n",
    "        Estimate of background.\n",
    "    s : float\n",
    "        Estimate of standard deviation of Gaussian noise.\n",
    "\n",
    "    Returns\n",
    "    -------\n",
    "    ll : array, shape(W-w+1, K)\n",
    "        ll[dw, k] - log-likelihood of observing image X_k given\n",
    "        that the prankster's face F is located at position dw\n",
    "    \"\"\"\n",
    "    # your code here"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# run this cell to test your implementation\n",
    "expected = np.array([[-3541.69812064, -5541.69812064],\n",
    "       [-4541.69812064, -6741.69812064],\n",
    "       [-6141.69812064, -8541.69812064]])\n",
    "actual = calculate_log_probability(tX, tF, tB, ts)\n",
    "assert np.allclose(actual, expected)\n",
    "print(\"OK\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 2. Implement calculate_lower_bound\n",
    "$$\\mathcal{L}(q, \\,F, \\,B,\\, s,\\, a) = \\sum_k \\biggl (\\mathbb{E} _ {q( d_k)}\\bigl ( \\log p(  X_{k}  \\mid {d}_{k} , \\,F,\\,B,\\,s) + \n",
    "    \\log p( d_k  \\mid a)\\bigr) - \\mathbb{E} _ {q( d_k)} \\log q( d_k)\\biggr) $$\n",
    "    \n",
    "Important notes:\n",
    "* Use already implemented calculate_log_probability! \n",
    "* Note that distributions $q( d_k)$ and $p( d_k  \\mid a)$ are discrete. For example, $P(d_k=i \\mid a) = a[i]$.\n",
    "* This implementation should not use cycles!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def calculate_lower_bound(X, F, B, s, a, q):\n",
    "    \"\"\"\n",
    "    Calculates the lower bound L(q, F, B, s, a) for \n",
    "    the marginal log likelihood.\n",
    "\n",
    "    Parameters\n",
    "    ----------\n",
    "    X : array, shape (H, W, K)\n",
    "        K images of size H x W.\n",
    "    F : array, shape (H, w)\n",
    "        Estimate of prankster's face.\n",
    "    B : array, shape (H, W)\n",
    "        Estimate of background.\n",
    "    s : float\n",
    "        Estimate of standard deviation of Gaussian noise.\n",
    "    a : array, shape (W-w+1)\n",
    "        Estimate of prior on position of face in any image.\n",
    "    q : array\n",
    "        q[dw, k] - estimate of posterior \n",
    "                   of position dw\n",
    "                   of prankster's face given image Xk\n",
    "\n",
    "    Returns\n",
    "    -------\n",
    "    L : float\n",
    "        The lower bound L(q, F, B, s, a) \n",
    "        for the marginal log likelihood.\n",
    "    \"\"\"\n",
    "    # your code here"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# run this cell to test your implementation\n",
    "expected = -12761.1875\n",
    "actual = calculate_lower_bound(tX, tF, tB, ts, ta, tq)\n",
    "assert np.allclose(actual, expected)\n",
    "print(\"OK\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 3. Implement E-step\n",
    "$$q(d_k) = p(d_k \\mid X_k, \\,F, \\,B, \\,s,\\, a) = \n",
    "\\frac {p(  X_{k}  \\mid {d}_{k} , \\,F,\\,B,\\,s)\\, p(d_k \\mid a)}\n",
    "{\\sum_{d'_k} p(  X_{k}  \\mid d'_k , \\,F,\\,B,\\,s) \\,p(d'_k \\mid a)}$$\n",
    "\n",
    "Important notes:\n",
    "* Use already implemented calculate_log_probability!\n",
    "* For computational stability, perform all computations with logarithmic values and apply exp only before return. Also, we recommend using this trick:\n",
    "$$\\beta_i = \\log{p_i(\\dots)} \\quad\\rightarrow \\quad\n",
    "\t\\frac{e^{\\beta_i}}{\\sum_k e^{\\beta_k}} = \n",
    "\t\\frac{e^{(\\beta_i - \\max_j \\beta_j)}}{\\sum_k e^{(\\beta_k- \\max_j \\beta_j)}}$$\n",
    "* This implementation should not use cycles!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def run_e_step(X, F, B, s, a):\n",
    "    \"\"\"\n",
    "    Given the current esitmate of the parameters, for each image Xk\n",
    "    esitmates the probability p(d_k|X_k, F, B, s, a).\n",
    "\n",
    "    Parameters\n",
    "    ----------\n",
    "    X : array, shape(H, W, K)\n",
    "        K images of size H x W.\n",
    "    F  : array_like, shape(H, w)\n",
    "        Estimate of prankster's face.\n",
    "    B : array shape(H, W)\n",
    "        Estimate of background.\n",
    "    s : float\n",
    "        Eestimate of standard deviation of Gaussian noise.\n",
    "    a : array, shape(W-w+1)\n",
    "        Estimate of prior on face position in any image.\n",
    "\n",
    "    Returns\n",
    "    -------\n",
    "    q : array\n",
    "        shape (W-w+1, K)\n",
    "        q[dw, k] - estimate of posterior of position dw\n",
    "        of prankster's face given image Xk\n",
    "    \"\"\"\n",
    "    # your code here"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# run this cell to test your implementation\n",
    "expected = np.array([[ 1.,  1.],\n",
    "                   [ 0.,  0.],\n",
    "                   [ 0.,  0.]])\n",
    "actual = run_e_step(tX, tF, tB, ts, ta)\n",
    "assert np.allclose(actual, expected)\n",
    "print(\"OK\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 4. Implement M-step\n",
    "$$a[j] = \\frac{\\sum_k q( d_k = j )}{\\sum_{j'}  \\sum_{k'} q( d_{k'} = j')}$$\n",
    "$$F[i, m] = \\frac 1 K  \\sum_k \\sum_{d_k} q(d_k)\\, X^k[i,\\, m+d_k]$$\n",
    "$$B[i, j] = \\frac {\\sum_k \\sum_{ d_k:\\, (i, \\,j) \\,\\not\\in faceArea(d_k)} q(d_k)\\, X^k[i, j]} \n",
    "\t  \t{\\sum_k \\sum_{d_k: \\,(i, \\,j)\\, \\not\\in faceArea(d_k)} q(d_k)}$$\n",
    "$$s^2 = \\frac 1 {HWK}   \\sum_k \\sum_{d_k} q(d_k)\n",
    "\t  \t\\sum_{i,\\, j}  (X^k[i, \\,j] - Model^{d_k}[i, \\,j])^2$$\n",
    "\n",
    "where $Model^{d_k}[i, j]$ is an image composed from background and face located at $d_k$.\n",
    "\n",
    "Important notes:\n",
    "* Update parameters in the following order: $a$, $F$, $B$, $s$.\n",
    "* When the parameter is updated, its __new__ value is used to update other parameters.\n",
    "* This implementation should use no more than 3 cycles and no embedded cycles!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def run_m_step(X, q, w):\n",
    "    \"\"\"\n",
    "    Estimates F, B, s, a given esitmate of posteriors defined by q.\n",
    "\n",
    "    Parameters\n",
    "    ----------\n",
    "    X : array, shape (H, W, K)\n",
    "        K images of size H x W.\n",
    "    q  :\n",
    "        q[dw, k] - estimate of posterior of position dw\n",
    "                   of prankster's face given image Xk\n",
    "    w : int\n",
    "        Face mask width.\n",
    "\n",
    "    Returns\n",
    "    -------\n",
    "    F : array, shape (H, w)\n",
    "        Estimate of prankster's face.\n",
    "    B : array, shape (H, W)\n",
    "        Estimate of background.\n",
    "    s : float\n",
    "        Estimate of standard deviation of Gaussian noise.\n",
    "    a : array, shape (W-w+1)\n",
    "        Estimate of prior on position of face in any image.\n",
    "    \"\"\"\n",
    "    # your code here"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# run this cell to test your implementation\n",
    "expected = [np.array([[ 3.27777778],\n",
    "                      [ 9.27777778]]),\n",
    " np.array([[  0.48387097,   2.5       ,   4.52941176],\n",
    "           [  6.48387097,   8.5       ,  10.52941176]]),\n",
    "  0.94868,\n",
    " np.array([ 0.13888889,  0.33333333,  0.52777778])]\n",
    "actual = run_m_step(tX, tq, tw)\n",
    "for a, e in zip(actual, expected):\n",
    "    assert np.allclose(a, e)\n",
    "print(\"OK\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 5. Implement EM_algorithm\n",
    "Initialize parameters, if they are not passed, and then repeat E- and M-steps till convergence.\n",
    "\n",
    "Please note that $\\mathcal{L}(q, \\,F, \\,B, \\,s, \\,a)$ must increase after each iteration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def run_EM(X, w, F=None, B=None, s=None, a=None, tolerance=0.001,\n",
    "           max_iter=50):\n",
    "    \"\"\"\n",
    "    Runs EM loop until the likelihood of observing X given current\n",
    "    estimate of parameters is idempotent as defined by a fixed\n",
    "    tolerance.\n",
    "\n",
    "    Parameters\n",
    "    ----------\n",
    "    X : array, shape (H, W, K)\n",
    "        K images of size H x W.\n",
    "    w : int\n",
    "        Face mask width.\n",
    "    F : array, shape (H, w), optional\n",
    "        Initial estimate of prankster's face.\n",
    "    B : array, shape (H, W), optional\n",
    "        Initial estimate of background.\n",
    "    s : float, optional\n",
    "        Initial estimate of standard deviation of Gaussian noise.\n",
    "    a : array, shape (W-w+1), optional\n",
    "        Initial estimate of prior on position of face in any image.\n",
    "    tolerance : float, optional\n",
    "        Parameter for stopping criterion.\n",
    "    max_iter  : int, optional\n",
    "        Maximum number of iterations.\n",
    "\n",
    "    Returns\n",
    "    -------\n",
    "    F, B, s, a : trained parameters.\n",
    "    LL : array, shape(number_of_iters + 2,)\n",
    "        L(q, F, B, s, a) at initial guess, \n",
    "        after each EM iteration and after\n",
    "        final estimate of posteriors;\n",
    "        number_of_iters is actual number of iterations that was done.\n",
    "    \"\"\"\n",
    "    H, W, N = X.shape\n",
    "    if F is None:\n",
    "        F = np.random.randint(0, 255, (H, w))\n",
    "    if B is None:\n",
    "        B = np.random.randint(0, 255, (H, W))\n",
    "    if a is None:\n",
    "        a = np.ones(W - w + 1)\n",
    "        a /= np.sum(a)\n",
    "    if s is None:\n",
    "        s = np.random.rand()*pow(64,2)\n",
    "    # your code here\n",
    "    "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# run this cell to test your implementation\n",
    "res = run_EM(tX, tw, max_iter=3)\n",
    "LL = res[-1]\n",
    "assert np.alltrue(LL[1:] - LL[:-1] > 0)\n",
    "print(\"OK\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Who is the prankster?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To speed up the computation, we will perform 5 iterations over small subset of images and then gradually increase the subset.\n",
    "\n",
    "If everything is implemented correctly, you will recognize the prankster (remember he is the one from [DeepBayes team](http://deepbayes.ru/#speakers)).\n",
    "\n",
    "Run EM-algorithm:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def show(F, i=1, n=1):\n",
    "    \"\"\"\n",
    "    shows face F at subplot i out of n\n",
    "    \"\"\"\n",
    "    plt.subplot(1, n, i)\n",
    "    plt.imshow(F, cmap=\"Greys_r\")\n",
    "    plt.axis(\"off\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "F, B, s, a = [None] * 4\n",
    "LL = []\n",
    "lens = [50, 100, 300, 500, 1000]\n",
    "iters = [5, 1, 1, 1, 1]\n",
    "plt.figure(figsize=(20, 5))\n",
    "for i, (l, it) in enumerate(zip(lens, iters)):\n",
    "    F, B, s, a, _ = run_EM(X[:, :, :l], w, F, B, s, a, max_iter=it)\n",
    "    show(F, i+1, 5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And this is the background:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "show(B)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Optional part: hard-EM"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If you have some time left, you can implement simplified version of EM-algorithm called hard-EM. In hard-EM, instead of finding posterior distribution $p(d_k|X_k, F, B, s, A)$ at E-step, we just remember its argmax $\\tilde d_k$ for each image $k$. Thus, the distribution q is replaced with a singular distribution:\n",
    "$$q(d_k) = \\begin{cases} 1, \\, if d_k = \\tilde d_k \\\\ 0, \\, otherwise\\end{cases}$$\n",
    "This modification simplifies formulas for $\\mathcal{L}$ and M-step and speeds their computation up. However, the convergence of hard-EM is usually slow.\n",
    "\n",
    "If you implement hard-EM, add binary flag hard_EM to the parameters of the following functions:\n",
    "* calculate_lower_bound\n",
    "* run_e_step\n",
    "* run_m_step\n",
    "* run_EM\n",
    "\n",
    "After implementation, compare overall computation time for EM and hard-EM till recognizable F."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
